Deep Learning for medical image classification

Description of a Convolutional Neural Network

Presentation of the chosen CNN

Overview of Transfer Learning

Data sets we used

What we did

  • Fixed Feature Extraction
  • Pre-trained models
  • Fine-tuning
  • Histogram equalization
  • Data augmentation

Final results

Conclusion

Convolutional Neural Network

cnn_view.png

3 major parts that compose a CNN

  • Convolution
  • Pooling
  • Fully connected layer

Convolution

convolution.jpg

Pooling

pooling.png

Fully connected layer

fully_connected_layer.jpg
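The three building blocks above can be sketched in plain NumPy. This is a minimal illustration (single-channel image, one filter), not the actual network used in the project:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

def fully_connected(x, weights, bias):
    """Dense layer applied to the flattened feature map."""
    return x.ravel() @ weights + bias

rng = np.random.default_rng(0)
img = rng.random((8, 8))
feat = max_pool2d(conv2d(img, np.ones((3, 3)) / 9))          # (3, 3) map
logits = fully_connected(feat, rng.random((9, 2)), np.zeros(2))  # 2 classes
```

A real CNN simply stacks many of these convolution/pooling stages before the final fully connected layers.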

VGG-19 overview

vgg19.jpg
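VGG-19 is available directly in Keras Applications. A minimal loading sketch (here with `weights=None` to avoid the ~550 MB download; the project uses the ImageNet weights):

```python
from tensorflow.keras.applications import VGG19

# Build the VGG-19 architecture; pass weights="imagenet" to get the
# pre-trained ImageNet weights used for transfer learning.
model = VGG19(weights=None, include_top=True)
print(model.input_shape)   # (None, 224, 224, 3)
print(len(model.layers))
```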

Transfer Learning

transfer_learning.png
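The transfer learning idea can be sketched in Keras: freeze the pre-trained convolutional base and train only a new classification head. The head layout below (global pooling, one 256-unit Dense layer, 2 classes) is an illustrative assumption, not the project's exact configuration:

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19

# weights=None for a light sketch; in practice weights="imagenet".
base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))
base.trainable = False                           # freeze pre-trained features

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation="relu")(x)
out = layers.Dense(2, activation="softmax")(x)   # e.g. pneumonia vs. normal
model = Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```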

Chest_xray

  • train set: (5221, 2)
  • test set: (624, 2)
  • val set: (16, 2)
In [9]:
show_chest_xray()

Cancer cells

  • train set: (72, 2)
  • test set: (26, 2)
  • val set: (26, 2)
In [4]:
show_cancer_cells()

Mini MIT Etus

  • train set: (120, 3)
  • test set: (120, 3)
In [23]:
show_miniMIT()

Kvasir version 2

  • train set: (4800, 8)
  • test set: (1600, 8)
  • val set: (1600, 8)
In [19]:
show_kvasir_v2()

Fixed Feature Extraction

  • CNN Codes
  • Transform the fully connected layers into fully convolutional layers
  • Results using VGG-19 + SVM
  • VLAD method

CNN Codes

  • In general, the more convolutional layers, the more complex the features that can be represented.
cancer cell microscope photo
feature extracted from block1_conv1
feature extracted from block5_conv1
chen_1.png chen_2.png chen_3.png
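CNN codes like the ones shown above can be extracted by building a Keras model that outputs intermediate activations. A sketch with `weights=None` for brevity (the project uses the ImageNet weights, and a real image instead of random input):

```python
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.applications import VGG19

vgg = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))

# "CNN codes": activations of chosen intermediate layers for a given image.
extractor = Model(vgg.input,
                  [vgg.get_layer("block1_conv1").output,
                   vgg.get_layer("block5_conv1").output])

img = np.random.random((1, 224, 224, 3)).astype("float32")
shallow, deep = extractor.predict(img)
print(shallow.shape)   # (1, 224, 224, 64) -- low-level edges/colours
print(deep.shape)      # (1, 14, 14, 512)  -- high-level patterns
```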

CNN codes

feature extracted from block1_conv1
feature extracted from block5_conv1
chen_4.png chen_5.png

Converting to a fully convolutional neural network

  • VGG-19 input size: (224, 224).
  • Image sizes wanted: (224 + 32n, 224 + 32n), where n = 0, 1, 2 (VGG-19 downsamples by a factor of 32, so valid input sizes grow in steps of 32)
original VGG-19
fully convolutional VGG-19
vgg19.jpg chen_6.png
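The conversion replaces VGG-19's Dense head with equivalent convolutions: fc1 becomes a 7×7 convolution, fc2 and the prediction layer become 1×1 convolutions, so the network accepts larger inputs and emits a grid of predictions. A sketch with random weights (in practice the Dense weights would be reshaped into the convolution kernels):

```python
import numpy as np
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19

# weights=None keeps the sketch light; outputs are meaningless until the
# pre-trained fc weights are copied into the new convolution kernels.
base = VGG19(weights=None, include_top=False, input_shape=(None, None, 3))

x = layers.Conv2D(4096, 7, activation="relu")(base.output)  # fc1 -> 7x7 conv
x = layers.Conv2D(4096, 1, activation="relu")(x)            # fc2 -> 1x1 conv
x = layers.Conv2D(1000, 1, activation="softmax")(x)         # predictions
fcn = Model(base.input, x)

# 224x224 input -> 1x1x1000 output; 288x288 (n = 2) -> 3x3x1000 output.
out = fcn.predict(np.random.random((1, 288, 288, 3)).astype("float32"))
print(out.shape)   # (1, 3, 3, 1000)
```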

Results using VGG-19 + SVM

VGG19 + SVM

chen_7.png chen_8.png
chen_9.png chen_10.png
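The VGG-19 + SVM pipeline boils down to: extract CNN codes, then fit a support vector machine on them. A sketch with scikit-learn, using random placeholder features (in the project, `X` would be flattened VGG-19 activations and `y` the cancer / non-cancer labels; the 512-d code size and linear kernel are assumptions):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((72, 512))      # 72 training images, 512-d CNN codes
y = rng.integers(0, 2, 72)     # binary labels

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X, y)
print(clf.score(X, y))         # training accuracy
```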

Representation with GradCAM

Cancer cell GradCAM representation on layer block5_pool
chen_11.png
Non-cancer cell GradCAM representation on layer block5_pool
chen_12.png
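A minimal Grad-CAM sketch for the `block5_pool` layer used above: weight each feature map by the mean gradient of the class score with respect to it, sum, and rectify. Here `weights=None` and a random input keep the example self-contained; a real run uses the trained weights and an actual image:

```python
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.applications import VGG19

vgg = VGG19(weights=None)
grad_model = Model(vgg.input,
                   [vgg.get_layer("block5_pool").output, vgg.output])

img = tf.random.uniform((1, 224, 224, 3))
with tf.GradientTape() as tape:
    conv_out, preds = grad_model(img)
    score = tf.gather(preds, tf.argmax(preds[0]), axis=1)  # top-class score

grads = tape.gradient(score, conv_out)        # d score / d activations
alpha = tf.reduce_mean(grads, axis=(1, 2))    # per-channel weights
cam = tf.nn.relu(tf.reduce_sum(conv_out * alpha[:, None, None, :], axis=-1))
heatmap = (cam / (tf.reduce_max(cam) + 1e-8)).numpy()[0]   # (7, 7) in [0, 1]
print(heatmap.shape)
```

The 7×7 heatmap is then upsampled and overlaid on the input image, as in the figures above.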

VLAD method


Principle

chen_13.png

VLAD method


Result

chen_16.png

Fine-tuning the VGG-19 network

  • Take a pre-trained model and try to find the best layers to retrain

vgg19.jpg

Loss function used

log_loss.png

  • Training for up to 300 epochs with an early-stopping callback (patience fixed at 10)
  • Adam optimizer with learning rate 1e-6
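The fine-tuning setup above can be sketched in Keras. Unfreezing only `block5` is an illustrative choice (the project searched for the best layers to retrain), and `weights=None` stands in for the ImageNet weights:

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")  # retrain block5 only

x = layers.Flatten()(base.output)
out = layers.Dense(1, activation="sigmoid")(x)         # binary log loss
model = Model(base.input, out)
model.compile(optimizer=Adam(learning_rate=1e-6),
              loss="binary_crossentropy", metrics=["accuracy"])

early_stop = EarlyStopping(monitor="val_loss", patience=10,
                           restore_best_weights=True)
# model.fit(train_ds, validation_data=val_ds,
#           epochs=300, callbacks=[early_stop])
```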

Results on Chest X-ray

vgg19_finetuning_chestxray_results.png

vgg19_finetunning_chestxray_history.png

Let's try to visualize what we predict

gradcam_vgg19_1.png gradcam_vgg19_2.png

It looks like we are mainly classifying whole chests as pneumonia/normal rather than the pneumonia itself!

Results on other data sets

vgg19_finetuning_results.png

Let's see an interesting thing from mini MIT data set

gradcam_mit.png

GradCAMs for layer : block5_pool

gradcam_mit2.png

What we can conclude about our exploration

  • pre-trained models perform better on larger data sets,
  • pre-trained models, even on larger data sets, tend to overfit

  • If the categories for our task of interest exist in the original data set used to train the ConvNet, our model can be qualified as strong because it can justify its output.

  • Otherwise, our model is weak, even with excellent metric results.

Thus, we decided to explore ways to improve our network in terms of interpretability

gradcam_block1.png gradcam_block2.png

gradcam_block3.png gradcam_block4.png

gradcam_block5.png

Xception network and Depthwise convolutions

  • Xception network by François Chollet (creator of Keras)

  • Depthwise convolution decreases computing time while giving nearly the same results as a normal convolution

    • The main difference is that each input channel is convolved once spatially, then a 1×1 (pointwise) convolution expands the result to the desired number of channels
  • Based on the first 3 blocks of VGG-19, we construct 2 blocks of depthwise convolutions, each followed by a normalization to prevent overfitting
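An illustrative sketch of such a network using Keras's `SeparableConv2D` (depthwise convolution followed by the 1×1 pointwise expansion) with batch normalization. The filter counts and block layout are assumptions, not the authors' exact configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(224, 224, 3)),
    # SeparableConv2D = depthwise spatial conv + 1x1 pointwise conv
    layers.SeparableConv2D(64, 3, padding="same", activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.SeparableConv2D(128, 3, padding="same", activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),   # binary output
])
print(model.count_params())   # far fewer parameters than regular convs
```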

custom_cnn_results.png

gradcam_differences.png

  • Cool! The network starts to look inside the chest to find pneumonia...

Histogram equalization

clahe.png

  • No significant improvement
  • Could be combined with the raw images in order to increase performance

Data augmentation

data_augmentation.png
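Augmentations like those shown above can be generated with Keras's `ImageDataGenerator`. The exact transformation ranges below are illustrative assumptions:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=10,        # small rotations
    width_shift_range=0.1,    # horizontal shifts
    height_shift_range=0.1,   # vertical shifts
    zoom_range=0.1,
    horizontal_flip=True,
)

batch = np.random.random((4, 224, 224, 3))   # stand-in for real images
augmented = next(augmenter.flow(batch, batch_size=4, shuffle=False))
print(augmented.shape)   # (4, 224, 224, 3)
```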

Data augmentation improvements

data_augmentation_results.png

Conclusion

  • We used 4 data sets with various properties to choose when, how and which Transfer Learning scenario to use for an image classification project

  • The most important properties are the size of the data set and its similarity to the original data set

  • If the data set is large, fine-tuning a pre-trained network is probably the best idea

  • If the data set is small, the best idea might be to train a linear classifier on features extracted from the fully connected layers of a pre-trained network.

    • Applying VLAD to the features extracted from the last convolutional layer would improve the accuracy.
  • If the data set you use is very similar to the ImageNet data set, you should just use the best pre-trained network available nowadays

  • We found that even a neural network that is excellent in terms of metrics can be useless if its output isn't relevant to humans. This is why we strongly encourage always checking what a neural network actually predicts instead of only looking for the best metric results